Statistically validated hierarchical clustering: Nested partitions in hierarchical trees
Abstract
We develop an algorithm that is fast and scalable in the detection of a nested partition extracted from a dendrogram obtained from hierarchical clustering of a multivariate series. Our algorithm provides a p-value for each clade observed in the hierarchical tree. The p-value is obtained by computing many bootstrap replicas of the dissimilarity matrix and by performing a statistical test on the difference between the dissimilarity associated with a given clade and that of its parent node. We prove the efficacy of our algorithm with a set of benchmarks generated by a hierarchically nested factor model. We compare the results of our algorithm with those of Pvclust. Pvclust is a widely-used algorithm pursuing a global approach, originally developed in the context of phylogenetic studies. In our numerical experiments, we focus on the role of multiple hypothesis test correction and on the robustness of the algorithms to inaccuracies and errors of datasets. We verify that our algorithm is much faster than Pvclust and has better scalability both in the number of elements and in the number of records of the investigated multivariate set. We also apply our algorithm to two empirical datasets, one related to a biological complex system and the other related to financial time-series. The clusters detected by our methodology are meaningful with respect to some consensus partitioning
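The bootstrap idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dissimilarity `1 - rho`, the average-linkage tree, the clade statistic (mean dissimilarity between a clade's two child groups), the one-sided comparison with the parent clade, and the Bonferroni threshold are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform


def correlation_dissimilarity(X):
    """Assumed dissimilarity: d_ij = 1 - rho_ij between rows (elements) of X."""
    return 1.0 - np.corrcoef(X)


def clade_children(Z, n):
    """Map each internal node id to the leaf lists of its two children."""
    leaves = {i: [i] for i in range(n)}
    children = {}
    for k, row in enumerate(Z):
        a, b = int(row[0]), int(row[1])
        children[n + k] = (leaves[a], leaves[b])
        leaves[n + k] = leaves[a] + leaves[b]
    return children


def bootstrap_clade_pvalues(X, n_boot=200, seed=0):
    """Estimate, for every non-root clade, how often its bootstrap
    dissimilarity is not smaller than that of its parent clade."""
    rng = np.random.default_rng(seed)
    n, t = X.shape
    D = correlation_dissimilarity(X)
    Z = linkage(squareform(D, checks=False), method="average")
    children = clade_children(Z, n)

    # parent[c] = id of the internal node that merges clade c with its sibling
    parent = {}
    for k, row in enumerate(Z):
        for child in (int(row[0]), int(row[1])):
            if child >= n:  # only internal nodes are clades to be tested
                parent[child] = n + k

    exceed = {node: 0 for node in parent}
    for _ in range(n_boot):
        cols = rng.integers(0, t, size=t)  # resample records with replacement
        Db = correlation_dissimilarity(X[:, cols])
        height = {node: Db[np.ix_(left, right)].mean()
                  for node, (left, right) in children.items()}
        for node, par in parent.items():
            if height[node] >= height[par]:
                exceed[node] += 1
    pvalues = {node: count / n_boot for node, count in exceed.items()}
    return pvalues, children


if __name__ == "__main__":
    # Toy two-group factor model: elements 0-3 and 4-7 share one factor each.
    rng = np.random.default_rng(1)
    factors = rng.standard_normal((2, 300))
    X = np.vstack([factors[g] + 0.5 * rng.standard_normal(300)
                   for g in (0, 0, 0, 0, 1, 1, 1, 1)])
    pvalues, children = bootstrap_clade_pvalues(X)
    alpha = 0.05 / len(pvalues)  # Bonferroni correction (an assumption)
    for node, p in sorted(pvalues.items()):
        left, right = children[node]
        print(sorted(left + right), p, p < alpha)
```

Each clade is tested locally against its parent only, in line with the abstract's contrast between this method and the global approach of Pvclust. The choice of test statistic and of the multiple-hypothesis correction would need to follow the paper itself in any serious use.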
Similar resources
Hierarchical clustering in minimum spanning trees.
The identification of clusters or communities in complex networks is a reappearing problem. The minimum spanning tree (MST), the tree connecting all nodes with minimum total weight, is regarded as an important transport backbone of the original weighted graph. We hypothesize that the clustering of the MST reveals insight in the hierarchical structure of weighted graphs. However, existing theori...
Dependent nonparametric trees for dynamic hierarchical clustering
Hierarchical clustering methods offer an intuitive and powerful way to model a wide variety of data sets. However, the assumption of a fixed hierarchy is often overly restrictive when working with data generated over a period of time: We expect both the structure of our hierarchy, and the parameters of the clusters, to evolve with time. In this paper, we present a distribution over collections ...
HIERARCHICAL DATA CLUSTERING MODEL FOR ANALYZING PASSENGERS’ TRIP IN HIGHWAYS
One of the most important issues in urban planning is developing sustainable public transportation. The basic condition for this purpose is analyzing the current situation, especially on the basis of data. Data mining is a set of new techniques that go beyond statistical data analysis. Clustering is a subset of these techniques, one of which is used for analyzing passengers’ trips. The result of...
Hierarchical Clustering of Trees: Algorithms and Experiments
We focus on the problem of experimentally evaluating the quality of hierarchical decompositions of trees with respect to criteria relevant in graph drawing applications. We suggest a new family of tree clustering algorithms based on the notion of t-divider and we empirically show the relevance of this concept as a generalization of the ideas of centroid and separator. We compare the t-divider b...
Improved initialisation of model-based clustering using Gaussian hierarchical partitions
Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and so to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mc...
Journal
Journal title: Physica D: Nonlinear Phenomena
Year: 2022
ISSN: ['1872-8022', '0167-2789']
DOI: https://doi.org/10.1016/j.physa.2022.126933